Abstract: We study the information rates of non-coherent, 
stationary, Gaussian, multiple-input multiple-output (MIMO) 
flat-fading channels that are achievable with nearest neighbour 
decoding and pilot-aided channel estimation. In particular, we 
analyse the behaviour of these achievable rates in the limit as the 
signal-to-noise ratio (SNR) tends to infinity. We demonstrate that 
nearest neighbour decoding and pilot-aided channel estimation 
achieves the capacity pre-log — which is defined as the limiting 
ratio of the capacity to the logarithm of SNR as the SNR tends 
to infinity — of non-coherent multiple-input single-output (MISO) 
flat-fading channels, and it achieves the best so far known lower 
bound on the capacity pre-log of non-coherent MIMO flat-fading 
channels. 

I. Introduction 

Coherent multiple-input multiple-output (MIMO) flat-fading 
channels have a capacity that increases with the signal-to-noise 
ratio (SNR) as $\min(n_t, n_r) \log \mathrm{SNR}$, where $n_t$ and $n_r$ are 
the numbers of transmit and receive antennas, respectively [1], 
[2]. This capacity growth can be achieved using independent 
and identically distributed (i.i.d.) Gaussian inputs with nearest 
neighbour decoding. The nearest neighbour decoder is a simple 
decoder that selects the codeword that is closest to the channel 
output. In a coherent channel with additive Gaussian noise, this 
decoder is the maximum-likelihood decoder and is therefore 
optimal in the sense that it minimises the error probability 
(see [3] and references therein). However, the coherent channel 
model assumes that there is a genie that provides the fading 
coefficients to the decoder, which is difficult to achieve in 
practice. We exclude the role of the genie by studying a 
scheme that estimates the fading via pilot symbols. Note 
that with imperfect fading estimations, the nearest neighbour 
decoder that treats the fading estimate as if it were perfect is 
not necessarily optimal. Nevertheless, we show that, in some 
cases, nearest neighbour decoding and pilot-aided channel 
estimation is optimal at high SNR in the sense that it achieves 
the capacity pre-log. The pre-log is defined as the limiting ratio 
of the achievable rate to log SNR as SNR tends to infinity. 
The capacity pre-log is defined in the same way but with the 
achievable rate replaced by the capacity. 
The capacity of non-coherent fading channels, where the 
receiver has no knowledge of the fading coefficients, has been 
studied in a number of works. Building upon [4], Hassibi and 
Hochwald [5] studied the capacity of the block-fading channel 
and used pilot symbols (also known as training symbols) 
to obtain reasonably accurate fading estimates. Jindal and 
Lozano [6] provided tools for a unified treatment of pilot-based 
channel estimation in both block and stationary bandlimited 
fading channels. In these works, lower bounds on the channel 
capacity were obtained. Lapidoth [7] studied a single-input 
single-output (SISO) fading channel for more general fading 
processes and showed that, depending on the predictability of 
the fading process, the capacity can grow with the SNR, inter 
alia, logarithmically or double-logarithmically. The extension 
of [7] to multiple-input single-output (MISO) fading channels 
can be found in [8]. A lower bound on the capacity of MIMO 
fading channels was derived by Etkin and Tse in [9]. 

Lapidoth and Shamai [10] and Weingarten et al. [11] 
studied non-coherent fading channels from a mismatched-decoding 
perspective. In particular, they studied achievable 
rates with Gaussian inputs and nearest neighbour decoding. 
In both works, it is assumed that there is a genie that provides 
imperfect estimates of the fading coefficients. 

In our work, we add the estimation of the fading coefficients 
to our analysis. In particular, we study a communication 
system where the transmitter emits pilot symbols at regular 
intervals, and where the receiver performs channel estimation 
and data detection separately. Based on the channel outputs 
corresponding to pilot transmissions, the channel estimator 
produces estimates for the remaining time instants using a 
linear minimum mean-square error (LMMSE) interpolator. 
Using these estimates, the data detector employs a nearest 
neighbour decoder to decide what the transmitted message 
was. We study the achievable rates of this communication 
scheme at high SNR. In particular, we study the pre-log for 
fading processes of bandlimited power spectral densities. 

For SISO fading channels, using some simplifying arguments, 
Lozano [12] and Jindal and Lozano [6] showed that 
this scheme achieves the capacity pre-log. In this paper, we 
prove this result without any simplifying assumptions and 
extend it to MIMO fading channels. If the inverse of twice 
the bandwidth of the fading process is an integer, then for 
MISO channels, the above scheme is optimal in the sense that 
it achieves the capacity pre-log derived by Koch and Lapidoth 
[8]. For MIMO channels, the above scheme achieves the best 
so far known lower bound on the capacity pre-log obtained in 
[9]. 

The paper is organised as follows. Section II describes 
the channel model and introduces the encoding and decoding 
scheme. Section III defines the pre-log and presents the main 
result. Section IV outlines the proof of this result. 

II. System Model 

We consider a discrete-time $n_r \times n_t$ MIMO flat-fading 
channel, whose channel output at time instant $k \in \mathbb{Z}$ (where 
$\mathbb{Z}$ denotes the set of integers) is the complex-valued 
$n_r$-dimensional random vector given by 

$$Y_k = \sqrt{\frac{\mathrm{SNR}}{n_t}}\, \mathbb{H}_k x_k + Z_k. \tag{1}$$

Here $x_k \in \mathbb{C}^{n_t}$ denotes the time-$k$ channel input vector (with 
$\mathbb{C}$ denoting the set of complex numbers); $\mathbb{H}_k \in \mathbb{C}^{n_r \times n_t}$ denotes 
the fading matrix at time $k$; and $Z_k \in \mathbb{C}^{n_r}$ denotes the additive 
noise vector at time $k$. 

The noise process $\{Z_k, k \in \mathbb{Z}\}$ is a sequence of independent 
and identically distributed (i.i.d.) complex Gaussian 
random vectors of zero mean and covariance matrix $I_{n_r}$, where 
$I_{n_r}$ is the $n_r \times n_r$ identity matrix. SNR denotes the average 
SNR for each receive antenna. 

The fading process $\{\mathbb{H}_k, k \in \mathbb{Z}\}$ is stationary, ergodic and 
Gaussian. We assume that the $n_r \cdot n_t$ processes $\{H_k(r,t), k \in \mathbb{Z}\}$, 
$r = 1, \ldots, n_r$, $t = 1, \ldots, n_t$ are independent and have the 
same law, with each process having zero mean, unit variance 
and power spectral density $f_H(\lambda)$, $-\frac{1}{2} \le \lambda \le \frac{1}{2}$. Thus, $f_H(\cdot)$ 
is a non-negative function satisfying 

$$\mathsf{E}\bigl[H_{k+m}(r,t)\, H_k^*(r,t)\bigr] = \int_{-1/2}^{1/2} e^{i 2\pi m \lambda} f_H(\lambda)\, d\lambda \tag{2}$$

where $(\cdot)^*$ denotes complex conjugation. We further assume 
that the power spectral density $f_H(\cdot)$ has bandwidth $\lambda_D < 1/2$, 
i.e., $f_H(\lambda) = 0$ for $|\lambda| > \lambda_D$ and $f_H(\lambda) > 0$ otherwise. 

We finally assume that the fading process $\{\mathbb{H}_k, k \in \mathbb{Z}\}$ and 
the noise process $\{Z_k, k \in \mathbb{Z}\}$ are independent and that their 
joint law does not depend on $\{x_k, k \in \mathbb{Z}\}$. 

The transmission involves both codewords and pilots. The 
former convey the message to be transmitted, and the latter are 
used to facilitate the estimation of the fading coefficients at 
the receiver. The codeword is selected from the codebook $\mathcal{C}$, 
which is drawn i.i.d. from a zero-mean unit-variance complex 
Gaussian distribution. The codeword is assumed to satisfy the 
average-power constraint 

$$\frac{1}{N} \sum_{k=1}^{N} \mathsf{E}\bigl[\|X_k(m)\|^2\bigr] \le n_t, \quad m \in \mathcal{M} \tag{3}$$

where $\mathcal{M} = \{1, \ldots, e^{NR}\}$ is the set of possible messages, 
and $N$ and $R$ denote the codeword length and the coding rate. 

To estimate the fading matrix, we transmit orthogonal pilot 
vectors. The pilot vector $p_t$ used to estimate the fading 
coefficients corresponding to the $t$-th transmit antenna is given 
by $p_t(t) = 1$ and $p_t(t') = 0$ for $t' \ne t$. For example, the first 
pilot vector is $p_1 = (1, 0, \ldots, 0)^{\mathsf{T}}$, where $(\cdot)^{\mathsf{T}}$ denotes the 
transpose. To estimate the whole fading matrix, we thus need 
to send the $n_t$ pilot vectors $p_1, \ldots, p_{n_t}$. 

The transmission scheme is as follows. Every $L$ time 
instants (for some $L \in \mathbb{Z}$), we transmit the $n_t$ pilot vectors 
$p_1, \ldots, p_{n_t}$. Each codeword is then split up into blocks of 
$L - n_t$ data vectors, which will be transmitted after the $n_t$ 
pilot vectors. The process of transmitting $L - n_t$ data vectors 
and $n_t$ pilot vectors continues until all $N$ data vectors are 
completed. Herein we assume that $N$ is an integer multiple 
of $L - n_t$.¹ Prior to transmitting the first data block, and 
after transmitting the last data block, we introduce a guard 
period of $L(T - 1)$ time instants (for some $T \in \mathbb{Z}$), where we 
transmit every $L$ time instants the $n_t$ pilot vectors $p_1, \ldots, p_{n_t}$, 
but we do not transmit data vectors in between. The guard 
period ensures that, at every time instant, we can employ a 
channel estimator that bases its estimation on the channel 
outputs corresponding to the $T$ past and the $T$ future pilot 
transmissions. This facilitates the analysis and does not incur 
a loss in terms of achievable rate. The above transmission 
scheme is illustrated in Figure 1. The channel estimator is 
described below. 

Note that the total block-length of the above transmission 
scheme (comprising data vectors, pilot vectors and guard 
period) is given by 

$$N' = N_p + N + N_g \tag{4}$$

where $N_p$ denotes the number of channel uses for pilot vectors, 
and where $N_g$ denotes the number of channel uses during the 
silent guard period, i.e., 

$$N_p = \Bigl( \frac{N}{L - n_t} + 1 + 2(T - 1) \Bigr) n_t \tag{5}$$

$$N_g = 2 (L - n_t)(T - 1). \tag{6}$$
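
As a rough illustration of the counts in (5) and (6), the following Python sketch lays out one transmission as a string of labels. The function names `block_lengths` and `schedule` and the labels `P` (pilot), `D` (data) and `-` (silent) are hypothetical choices made for this example only; the layout assumes that each group of $n_t$ pilots is followed by $L - n_t$ data or silent instants, as described above.

```python
def block_lengths(N, n_t, L, T):
    """Return (pilot uses, silent guard uses, total block length), following (5), (6), (4)."""
    if N % (L - n_t) != 0:
        raise ValueError("N must be an integer multiple of L - n_t")
    data_blocks = N // (L - n_t)
    # One pilot group precedes each data block, one follows the last block,
    # and each of the two guard periods contributes T - 1 further groups.
    pilot_groups = data_blocks + 1 + 2 * (T - 1)
    n_pilot = pilot_groups * n_t
    n_guard = 2 * (L - n_t) * (T - 1)
    return n_pilot, n_guard, n_pilot + N + n_guard


def schedule(N, n_t, L, T):
    """Label every time instant as 'P' (pilot), 'D' (data) or '-' (silent)."""
    data_blocks = N // (L - n_t)
    labels = []
    # Every pilot group except the very last one is followed by L - n_t instants.
    for g in range(data_blocks + 2 * (T - 1)):
        labels += ["P"] * n_t
        carries_data = (T - 1) <= g < (T - 1) + data_blocks
        labels += ["D" if carries_data else "-"] * (L - n_t)
    labels += ["P"] * n_t  # closing pilot group
    return "".join(labels)


if __name__ == "__main__":
    # Parameters of Fig. 1 (n_t = 2, L = 7, T = 2) with two data blocks (N = 10).
    print(block_lengths(10, 2, 7, 2))
    print(schedule(10, 2, 7, 2))
```

Counting the labels in the returned string gives the same numbers as the closed-form expressions, which is a quick sanity check on the bookkeeping of pilots and guard instants.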



We now turn to the decoder. Let $\mathcal{D}$ denote the set of time 
indices where data vectors of a codeword are transmitted, 
and let $\mathcal{P}$ denote the set of time indices where pilots are 
transmitted. The decoder consists of two parts: a channel 
estimator and a data detector. The channel estimator considers 
the channel output vectors $Y_{k'}$, $k' \in \mathcal{P}$ corresponding to the past 
and future $T$ pilot transmissions and estimates $H_k(r,t)$ using 
a linear interpolator, i.e., the estimate $\hat{H}_k^{(T)}(r,t)$ of the fading 
coefficient $H_k(r,t)$ is given by 

$$\hat{H}_k^{(T)}(r,t) = \sum_{\substack{k' = k - TL: \\ k' \in \mathcal{P}}}^{k + TL} a_{k'}(r,t)\, Y_{k'}(r) \tag{7}$$

where the coefficients $a_{k'}(r,t)$ are chosen in order to minimise 
the mean-squared error. 

¹If $N$ is not an integer multiple of $L - n_t$, then the last $L - n_t$ instants 
are not fully used by data vectors and therefore contain time instants where 
we do not transmit anything. The loss in information rate thereby incurred 
vanishes as $N$ tends to infinity. 

Fig. 1. Structure of pilot and data transmission for $n_t = 2$, $L = 7$ and $T = 2$. 



Note that, since the pilot vectors transmit only from one 
antenna, the fading coefficients corresponding to all transmit 
and receive antennas $(r, t)$ can be observed. Further note that, 
since the fading processes $\{H_k(r,t), k \in \mathbb{Z}\}$, $r = 1, \ldots, n_r$, 
$t = 1, \ldots, n_t$ are independent, estimating $H_k(r,t)$ only based 
on $\{Y_k(r), k \in \mathbb{Z}\}$ rather than on $\{Y_k, k \in \mathbb{Z}\}$ incurs no loss 
in optimality. 

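Since the time-lags between $\mathbb{H}_k$, $k \in \mathcal{D}$ and the observations 
$Y_{k'}$, $k' \in \mathcal{P}$ depend on $k$, it follows that the interpolation error 

$$E_k^{(T)}(r,t) = H_k(r,t) - \hat{H}_k^{(T)}(r,t) \tag{8}$$

is not stationary but cyclo-stationary with period $L$. Nevertheless, 
it can be shown that, irrespective of $(r,t)$, the variance 
of the interpolation error 

$$\sigma_T^2(\ell, r, t) = \mathsf{E}\Bigl[ \bigl| H_k(r,t) - \hat{H}_k^{(T)}(r,t) \bigr|^2 \Bigr] \tag{9}$$

tends to the following expression as $T$ tends to infinity [13]: 

$$\begin{aligned}
\sigma_\ell^2 &\triangleq \lim_{T \to \infty} \sigma_T^2(\ell, r, t) && (10) \\
&= 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\, |f_{L,\ell}(\lambda)|^2}{\mathrm{SNR}\, f_{L,0}(\lambda) + 1}\, d\lambda && (11)
\end{aligned}$$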



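where $\ell = k \bmod L$ denotes the remainder of $k/L$. Here 
$f_{L,\ell}(\cdot)$ is given by 

$$f_{L,\ell}(\lambda) = \frac{1}{L} \sum_{j=0}^{L-1} \bar{f}_H\Bigl( \frac{\lambda - j}{L} \Bigr) e^{i 2\pi \ell \frac{\lambda - j}{L}}, \quad \ell = 0, \ldots, L-1 \tag{12}$$

and $\bar{f}_H(\cdot)$ is the periodic function of period $[-1/2, 1/2)$ that 
coincides with $f_H(\lambda)$ for $-1/2 \le \lambda \le 1/2$. If 

$$L \le \frac{1}{2\lambda_D} \tag{13}$$

then $|f_{L,\ell}(\cdot)|$ becomes 

$$|f_{L,\ell}(\lambda)| = f_{L,0}(\lambda) = \frac{1}{L} f_H\Bigl( \frac{\lambda}{L} \Bigr), \quad -\frac{1}{2} \le \lambda \le \frac{1}{2}. \tag{14}$$

In this case the interpolation error is given by 

$$\sigma_\ell^2 = 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\, (f_H(\lambda))^2}{\mathrm{SNR}\, f_H(\lambda) + L}\, d\lambda, \quad \ell = 0, \ldots, L-1 \tag{15}$$

which vanishes as the SNR tends to infinity. Recall that $\lambda_D$ 
denotes the bandwidth of $f_H(\cdot)$. Thus, (13) implies that no 
aliasing occurs as we undersample the fading process $L$ times. 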
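
To get a feeling for how fast the interpolation error in (15) decays, the following Python sketch evaluates the integral numerically. It assumes a rectangular power spectral density $f_H(\lambda) = 1/(2\lambda_D)$ for $|\lambda| \le \lambda_D$, which is an illustrative choice that does not come from the analysis above; the function names and the midpoint-rule integration are likewise hypothetical. For this spectrum, (15) reduces to $2\lambda_D L / (\mathrm{SNR} + 2\lambda_D L)$, so the sketch can be checked against a closed form.

```python
def rect_psd(lam, lam_d):
    """Rectangular unit-power PSD of bandwidth lam_d (an assumption for this example)."""
    return 1.0 / (2.0 * lam_d) if abs(lam) <= lam_d else 0.0


def interpolation_error(snr, L, lam_d, steps=20000):
    """Evaluate the right-hand side of (15) with a midpoint rule on [-1/2, 1/2]."""
    if 2.0 * lam_d * L > 1.0 + 1e-12:
        raise ValueError("(15) requires L <= 1/(2*lam_d)")
    h = 1.0 / steps
    integral = 0.0
    for i in range(steps):
        lam = -0.5 + (i + 0.5) * h
        f = rect_psd(lam, lam_d)
        integral += snr * f * f / (snr * f + L) * h
    return 1.0 - integral


def closed_form_error(snr, L, lam_d):
    """Closed form of (15) for the rectangular PSD only."""
    return 2.0 * lam_d * L / (snr + 2.0 * lam_d * L)


if __name__ == "__main__":
    for snr in (10.0, 100.0, 1000.0):
        print(snr, interpolation_error(snr, 5, 0.1), closed_form_error(snr, 5, 0.1))
```

Under this assumption the error decays like $1/\mathrm{SNR}$, so the product of the SNR and the error stays bounded by $2\lambda_D L$.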

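The channel estimator feeds the sequence of fading estimates 
$\{\hat{\mathbb{H}}_k^{(T)}, k \in \mathcal{D}\}$ (which is composed of the matrix entries 
$\{\hat{H}_k^{(T)}(r,t), k \in \mathcal{D}\}$) to the data detector. We shall denote its 
realisation by $\{\hat{\mathsf{H}}_k^{(T)}, k \in \mathcal{D}\}$. Based on the channel outputs 
$\{y_k, k \in \mathcal{D}\}$ and fading estimates $\{\hat{\mathsf{H}}_k^{(T)}, k \in \mathcal{D}\}$, the data 
detector uses a nearest neighbour decoder to guess which 
message was transmitted. Thus, the decoder decides on the 
message $\hat{m}$ that satisfies 

$$\hat{m} = \arg\min_{m \in \mathcal{M}} D(m) \tag{16}$$

where 

$$D(m) = \sum_{k \in \mathcal{D}} \Bigl\| y_k - \sqrt{\frac{\mathrm{SNR}}{n_t}}\, \hat{\mathsf{H}}_k^{(T)} x_k(m) \Bigr\|^2 \tag{17}$$

and where $\|\cdot\|$ denotes the Euclidean norm. 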
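
The decision rule (16) with the metric (17) can be sketched in a few lines of Python. This is a minimal illustration, not an efficient decoder: the names `metric` and `nearest_neighbour_decode`, the list-based data layout, and the zero-based message indices are assumptions made for this example. A codeword is a list of input vectors $x_k(m)$, `outputs` is the list of received vectors $y_k$, and `estimates` is the list of estimated fading matrices, each stored as $n_r$ rows of $n_t$ complex entries.

```python
import math


def metric(codeword, outputs, estimates, snr, n_t):
    """D(m) in (17): sum over data instants of ||y_k - sqrt(SNR/n_t) H_hat_k x_k(m)||^2."""
    scale = math.sqrt(snr / n_t)
    total = 0.0
    for x, y, h_hat in zip(codeword, outputs, estimates):
        for r, row in enumerate(h_hat):
            # Output predicted at receive antenna r if the estimate were perfect.
            predicted = scale * sum(h * x_t for h, x_t in zip(row, x))
            total += abs(y[r] - predicted) ** 2
    return total


def nearest_neighbour_decode(codebook, outputs, estimates, snr, n_t):
    """Return the (zero-based) index of the codeword minimising D(m), as in (16)."""
    return min(
        range(len(codebook)),
        key=lambda m: metric(codebook[m], outputs, estimates, snr, n_t),
    )


if __name__ == "__main__":
    # Toy example: n_t = 2, n_r = 1, two codewords of length 2, noiseless outputs.
    codebook = [[[1 + 0j, 0j], [0j, 1 + 0j]], [[0j, 1 + 0j], [1 + 0j, 0j]]]
    estimates = [[[1 + 0j, 1j]], [[1 + 0j, 1j]]]
    scale = math.sqrt(4.0 / 2)
    outputs = [[scale * 1j], [scale * 1.0]]  # generated from codeword index 1
    print(nearest_neighbour_decode(codebook, outputs, estimates, 4.0, 2))
```

Note that the decoder treats the fading estimate as if it were the true fading matrix, which is exactly the mismatch that the analysis has to account for.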

III. The Pre-Log 

We say that a rate is achievable if the error probability 
tends to zero as the codeword length tends to infinity. In this 
work, we study the maximum rate $R^*(\mathrm{SNR})$ that is achievable 
with nearest neighbour decoding and pilot-aided channel 
estimation. We focus on the achievable rates at high SNR. In 
particular, we are interested in the maximum achievable pre-log, 
defined as 

$$\Pi_{R^*} = \limsup_{\mathrm{SNR} \to \infty} \frac{R^*(\mathrm{SNR})}{\log \mathrm{SNR}}. \tag{18}$$

The capacity pre-log, which is given by (18) but with 
$R^*(\mathrm{SNR})$ replaced by the capacity $C(\mathrm{SNR})$, of SISO fading 
channels was computed by Lapidoth [7] as 

$$\Pi_C = \mu\bigl( \{\lambda : f_H(\lambda) = 0\} \bigr) \tag{19}$$

where $\mu(\cdot)$ denotes the Lebesgue measure on the interval 
$[-1/2, 1/2]$. Koch and Lapidoth [8] extended this result to 
MISO fading channels and showed that if the fading processes 
$\{H_k(t), k \in \mathbb{Z}\}$, $t = 1, \ldots, n_t$ are independent and have the 
same law, then the capacity pre-log of MISO fading channels 
is equal to the capacity pre-log of the SISO fading channel 
with fading process $\{H_k(1), k \in \mathbb{Z}\}$. Using (19), the capacity 
pre-log of MISO fading channels with power spectral density 
of bandwidth $\lambda_D$ can be evaluated as 

$$\Pi_C = 1 - 2\lambda_D. \tag{20}$$

Since $R^*(\mathrm{SNR}) \le C(\mathrm{SNR})$, it follows that $\Pi_{R^*} \le \Pi_C$. 

To the best of our knowledge, the capacity pre-log of MIMO 
fading channels is unknown. For independent fading processes 
$\{H_k(r,t), k \in \mathbb{Z}\}$, $t = 1, \ldots, n_t$, $r = 1, \ldots, n_r$ that have the 
same law, the best so far known lower bound on the MIMO 
pre-log is due to Etkin and Tse [9]: 

$$\Pi_C \ge \min(n_t, n_r) \Bigl( 1 - \min(n_t, n_r)\, \mu\bigl( \{\lambda : f_H(\lambda) > 0\} \bigr) \Bigr). \tag{21}$$

For power spectral densities that are bandlimited to $\lambda_D$, this 
becomes 

$$\Pi_C \ge \min(n_t, n_r) \bigl( 1 - \min(n_t, n_r)\, 2\lambda_D \bigr). \tag{22}$$

Observe that (22) specialises to (20) for $n_r = 1$. It should 
be noted that the capacity pre-log for MISO and SISO fading 
channels was derived under a peak-power constraint on the 
channel inputs, whereas the lower bound on the capacity pre-log 
for MIMO fading channels was derived under an average-power 
constraint. Clearly, the capacity pre-log corresponding 
to a peak-power constraint can never be larger than the 
capacity pre-log corresponding to an average-power constraint. 
It is believed that the two pre-logs are in fact identical (see 
the conclusion in [7]). 

In this paper, we show that a communication scheme that 
employs nearest neighbour decoding and pilot-aided channel 
estimation achieves the following pre-log. 

Theorem 1: Consider the above Gaussian MIMO flat-fading 
channel with $n_t$ transmit antennas and $n_r$ receive antennas. 
Then, the transmission and decoding scheme described in 
Section II achieves 

$$\Pi_{R^*} \ge \min(n_t, n_r) \Bigl( 1 - \frac{\min(n_t, n_r)}{L^*} \Bigr) \tag{23}$$

where $L^*$ is the largest integer satisfying $L^* \le \frac{1}{2\lambda_D}$. 

Proof: Due to page limitations, only an outline of the 
proof is given in Section IV. ■ 

Remark 1: We derive Theorem 1 for i.i.d. Gaussian inputs 
satisfying the average-power constraint (3). Nevertheless, using 
truncated Gaussian inputs, it can be shown that Theorem 1 
also holds when the channel inputs have to satisfy a peak-power 
constraint, i.e., with probability one $|X_k| \le 1$. 

If $1/(2\lambda_D)$ is an integer, then (23) becomes 

$$\Pi_{R^*} \ge \min(n_t, n_r) \bigl( 1 - \min(n_t, n_r)\, 2\lambda_D \bigr). \tag{24}$$

Thus, in this case nearest neighbour decoding together with 
pilot-aided channel estimation achieves the capacity pre-log of 
MISO fading channels (20), as well as the lower bound on the 
capacity pre-log of MIMO fading channels (22). 

Comparing (23) and (22) with the capacity pre-log 
$\min(n_t, n_r)$ for coherent fading channels [2], we observe 
that, for a fading process of bandwidth $\lambda_D$, the 
penalty for not knowing the fading coefficients is roughly 
$(\min(n_t, n_r))^2\, 2\lambda_D$. Consequently, the lower bound (23) does 
not grow linearly with $\min(n_t, n_r)$, but it is a quadratic 
function of $\min(n_t, n_r)$ that achieves its maximum at 

$$\min(n_t, n_r) = \frac{L^*}{2}. \tag{25}$$

This gives rise to the lower bound 

$$\Pi_{R^*} \ge \frac{L^*}{4} \tag{26}$$

which cannot be larger than $1/(8\lambda_D)$. The same holds for the 
lower bound (22). 
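
The trade-off between the number of antennas and the pilot overhead can be made concrete with a short Python sketch that evaluates the right-hand side of (23). The function names are hypothetical, and exact rational arithmetic via `fractions.Fraction` is used only to avoid rounding issues when $1/(2\lambda_D)$ is an integer; the bandwidth $\lambda_D = 1/20$ in the usage example is an arbitrary illustrative value.

```python
import math
from fractions import Fraction


def largest_pilot_spacing(lam_d):
    """L* in Theorem 1: the largest integer with L* <= 1/(2*lam_d)."""
    return math.floor(1 / (2 * Fraction(lam_d)))


def prelog_lower_bound(n_t, n_r, lam_d):
    """Right-hand side of (23) for n_t transmit and n_r receive antennas."""
    m = min(n_t, n_r)
    return m * (1 - Fraction(m, largest_pilot_spacing(lam_d)))


if __name__ == "__main__":
    lam_d = Fraction(1, 20)  # illustrative bandwidth, so that L* = 10
    for m in range(1, 11):
        print(m, prelog_lower_bound(m, m, lam_d))
```

For this bandwidth the bound peaks at $\min(n_t, n_r) = 5 = L^*/2$ with value $5/2 = L^*/4$, in line with (25) and (26), and it decreases again when more antennas are used.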

IV. Proof Outline 

We first note that it suffices to consider the case where $n_t = n_r$. 
If $n_t > n_r$, then we employ only $n_r$ transmit antennas, and 
if $n_r > n_t$, then we ignore $n_r - n_t$ antennas at the receiver. 
This yields in both cases a lower bound on the achievable rate. 

To prove Theorem 1, we analyse the generalised mutual 
information (GMI) for the above channel and communication 
scheme. The GMI, denoted by $I_T^{\mathrm{gmi}}(\mathrm{SNR})$, specifies the highest 
information rate for which the average probability of error, 
averaged over the ensemble of i.i.d. Gaussian codebooks, tends 
to zero as the codeword length $N$ tends to infinity (see [3], 
[10], [11] and references therein). 

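Let $\mathbb{E}_k^{(T)}$ denote the estimation error in estimating $\mathbb{H}_k$, i.e., 
$\mathbb{E}_k^{(T)}$ is composed of the matrix entries $E_k^{(T)}(r,t)$ defined in (8). Then, 
for the above channel model, the GMI can be evaluated as 

$$I_T^{\mathrm{gmi}}(\mathrm{SNR}) = \sup_{\theta < 0} \bigl( \theta B(\mathrm{SNR}) - \kappa(\theta, \mathrm{SNR}) \bigr) \tag{27}$$

where 

$$B(\mathrm{SNR}) = \frac{1}{L} \sum_{\ell = n_t}^{L-1} \Bigl( n_r + \frac{\mathrm{SNR}}{n_t}\, \mathsf{E}\Bigl[ \bigl\| \mathbb{E}_\ell^{(T)} \bigr\|_F^2 \Bigr] \Bigr) \tag{28}$$

(with $\|\cdot\|_F$ denoting the Frobenius norm); and where 
$\kappa(\theta, \mathrm{SNR})$ is the conditional log moment-generating function 
of the metric $D(m')$ associated with an incorrect message $m'$, 
conditioned on the channel outputs and on the fading 
estimates, which is given by 

$$\kappa(\theta, \mathrm{SNR}) = \frac{1}{L} \sum_{\ell = n_t}^{L-1} \mathsf{E}\Bigl[ \theta\, Y_\ell^{\dagger} \Bigl( I_{n_r} - \theta \frac{\mathrm{SNR}}{n_t} \hat{\mathbb{H}}_\ell^{(T)} \hat{\mathbb{H}}_\ell^{(T)\dagger} \Bigr)^{-1} Y_\ell - \log\det\Bigl( I_{n_r} - \theta \frac{\mathrm{SNR}}{n_t} \hat{\mathbb{H}}_\ell^{(T)} \hat{\mathbb{H}}_\ell^{(T)\dagger} \Bigr) \Bigr]. \tag{29}$$

Following [14], it can be shown that for $\theta < 0$ 

$$\theta\, Y_\ell^{\dagger} \Bigl( I_{n_r} - \theta \frac{\mathrm{SNR}}{n_t} \hat{\mathbb{H}}_\ell^{(T)} \hat{\mathbb{H}}_\ell^{(T)\dagger} \Bigr)^{-1} Y_\ell \le 0. \tag{30}$$

As observed in [14], the choice 

$$\theta = -\frac{1}{n_r + n_r\, \mathrm{SNR}\, \sigma_{*,T}^2}$$

yields a good lower bound at high SNR. Here 

$$\sigma_{*,T}^2 = \max_{r,t,\ell} \mathsf{E}\Bigl[ \bigl| E_\ell^{(T)}(r,t) \bigr|^2 \Bigr]. \tag{31}$$

Substituting this choice into the right-hand side (RHS) of (27), 
and applying (30) to upper-bound $\kappa(\theta, \mathrm{SNR})$, we obtain 

$$I_T^{\mathrm{gmi}}(\mathrm{SNR}) \ge \frac{1}{L} \sum_{\ell = n_t}^{L-1} \mathsf{E}\Bigl[ \log\det\Bigl( I_{n_r} + \frac{\mathrm{SNR}\, \hat{\mathbb{H}}_\ell^{(T)} \hat{\mathbb{H}}_\ell^{(T)\dagger}}{n_t n_r + n_t n_r\, \mathrm{SNR}\, \sigma_{*,T}^2} \Bigr) \Bigr] - \frac{L - n_t}{L}. \tag{32}$$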



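We continue by analysing the RHS of (32) in the limit as 
the size of the observation window $T$ of the channel estimator 
tends to infinity. To this end, we note that, for $L \le \frac{1}{2\lambda_D}$, the 
interpolation error tends to (15), namely 

$$\sigma_*^2 \triangleq \lim_{T \to \infty} \sigma_{*,T}^2 = 1 - \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\, (f_H(\lambda))^2}{\mathrm{SNR}\, f_H(\lambda) + L}\, d\lambda. \tag{33}$$

Similarly, since by the orthogonality principle $\hat{\mathbb{H}}_k^{(T)}$ and $\mathbb{E}_k^{(T)}$ 
are independent, and since all entries in $\mathbb{H}_k$ have unit variance, 
it follows that 

$$\lim_{T \to \infty} \mathsf{E}\Bigl[ \bigl| \hat{H}_k^{(T)}(r,t) \bigr|^2 \Bigr] = \lim_{T \to \infty} \bigl( 1 - \sigma_T^2(\ell, r, t) \bigr) = \int_{-1/2}^{1/2} \frac{\mathrm{SNR}\, (f_H(\lambda))^2}{\mathrm{SNR}\, f_H(\lambda) + L}\, d\lambda. \tag{34}$$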



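We thus have by (34) that, irrespective of $\ell$, the estimate 

$$\frac{\hat{\mathbb{H}}_\ell^{(T)}}{\sqrt{n_t n_r + n_t n_r\, \mathrm{SNR}\, \sigma_{*,T}^2}} \quad \text{tends to} \quad \frac{\bar{\mathbb{H}}}{\sqrt{n_t n_r + n_t n_r\, \mathrm{SNR}\, \sigma_*^2}} \quad \text{in distribution} \tag{35}$$

as $T$ tends to infinity, where the entries of $\bar{\mathbb{H}}$ are i.i.d., 
circularly-symmetric, complex Gaussian random variables 
with zero mean and variance $1 - \sigma_*^2$. Consequently, since 
the function $A \mapsto \log\det(I + A)$ is continuous and bounded 
from below, we obtain from Portmanteau's Lemma [15] that 

$$\lim_{T \to \infty} \mathsf{E}\Bigl[ \log\det\Bigl( I_{n_r} + \frac{\mathrm{SNR}\, \hat{\mathbb{H}}_\ell^{(T)} \hat{\mathbb{H}}_\ell^{(T)\dagger}}{n_t n_r + n_t n_r\, \mathrm{SNR}\, \sigma_{*,T}^2} \Bigr) \Bigr] \ge \mathsf{E}\Bigl[ \log\det\Bigl( I_{n_r} + \frac{\mathrm{SNR}\, \bar{\mathbb{H}} \bar{\mathbb{H}}^{\dagger}}{n_t n_r + n_t n_r\, \mathrm{SNR}\, \sigma_*^2} \Bigr) \Bigr] \tag{36}$$

which yields the following lower bound on the GMI: 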



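$$\begin{aligned}
\lim_{T \to \infty} I_T^{\mathrm{gmi}}(\mathrm{SNR})
&\ge \frac{L - n_t}{L} \Bigl( \mathsf{E}\Bigl[ \log\det\Bigl( I_{n_r} + \frac{\mathrm{SNR}\, \bar{\mathbb{H}} \bar{\mathbb{H}}^{\dagger}}{n_t n_r + n_t n_r\, \mathrm{SNR}\, \sigma_*^2} \Bigr) \Bigr] - 1 \Bigr) && (37) \\
&\ge \frac{L - n_t}{L} \Bigl( \mathsf{E}\Bigl[ \log\det \frac{\mathrm{SNR}\, \bar{\mathbb{H}} \bar{\mathbb{H}}^{\dagger}}{n_t n_r + n_t n_r\, \mathrm{SNR}\, \sigma_*^2} \Bigr] - 1 \Bigr) && (38) \\
&= \frac{L - n_t}{L} \Bigl( n_t \log \mathrm{SNR} - n_t \log\bigl( n_t^2 + n_t^2\, \mathrm{SNR}\, \sigma_*^2 \bigr) + \mathsf{E}\bigl[ \log\det \bar{\mathbb{H}} \bar{\mathbb{H}}^{\dagger} \bigr] - 1 \Bigr). && (39)
\end{aligned}$$

Here the second step follows by lower-bounding $\log\det(I + A) \ge \log\det A$; 
and the third step follows by evaluating the 
determinant and by using that, by our assumption, $n_t = n_r$. 

To compute a lower bound on the pre-log 

$$\Pi_{R^*} = \limsup_{\mathrm{SNR} \to \infty} \frac{R^*(\mathrm{SNR})}{\log \mathrm{SNR}} \tag{40}$$

we first note that, by [16], $\mathsf{E}\bigl[ \log\det \bar{\mathbb{H}} \bar{\mathbb{H}}^{\dagger} \bigr]$ is finite. We further 
note that, by (33) and because $f_H(\cdot)$ integrates to one, 

$$\mathrm{SNR}\, \sigma_*^2 = \int_{-\lambda_D}^{\lambda_D} \frac{\mathrm{SNR}\, L\, f_H(\lambda)}{\mathrm{SNR}\, f_H(\lambda) + L}\, d\lambda \le 2 \lambda_D L \tag{41}$$

which implies that $\log\bigl( n_t^2 + n_t^2\, \mathrm{SNR}\, \sigma_*^2 \bigr)$ is finite, too. Thus, 
computing the ratio of the RHS of (39) to $\log \mathrm{SNR}$ in the limit 
as the SNR tends to infinity, we obtain the lower bound 

$$\begin{aligned}
\Pi_{R^*} &\ge \frac{L - n_t}{L}\, n_t && (42) \\
&= \min(n_t, n_r) \Bigl( 1 - \frac{\min(n_t, n_r)}{L} \Bigr), \quad L \le \frac{1}{2\lambda_D} && (43)
\end{aligned}$$

where we have used that $n_t = n_r = \min(n_t, n_r)$. The 
condition $L \le 1/(2\lambda_D)$ is necessary since otherwise (15) 
would not hold. This proves Theorem 1. ■ 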